Scour
🛡️ AI Security
Model Poisoning, Adversarial Examples, Prompt Injection, AI Safety
Scoured 28333 posts in 66.1 ms
Mechanistic Steering of LLMs Reveals Layer-wise Feature Vulnerabilities in Adversarial Settings
🕳 LLM Vulnerabilities · arxiv.org · 2d
The Agentic AI Security Company
💻 Coding Agents · straiker.ai · 3d · Hacker News
Research Sabotage in ML Codebases
🔎 AI Auditing · lesswrong.com · 13h
The Five Horsemen of Prompt Injection: A Technical Deep Dive into LLM Attack Vectors
💉 Prompt Injection · pub.towardsai.net · 1d
AI-Augmented Social Engineering: When Trust Becomes a Control-Plane Risk
🛡️ AI Safety · zenodo.org · 4d · Hacker News
ML Safety Newsletter #20: AI Wellbeing, Classifier Jailbreaking and Honest Pushback Benchmarking
🛡️ AI Safety · lesswrong.com · 1d
Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines
💻 Coding Agents · arxiv.org · 2d
Third Symposium on AIT & ML: AI Safety Applications
🛡️ AI Safety · lesswrong.com · 4d
AI companies should publish security assessments
🔎 AI Auditing · lesswrong.com · 2d
From Stateless Queries to Autonomous Actions: A Layered Security Framework for Agentic AI Systems
💻 Coding Agents · arxiv.org · 2d
AI safety can be a Pascal's mugging even if p(doom) is high
🛡️ AI Safety · lesswrong.com · 4d
Evaluation of Prompt Injection Defenses in Large Language Models
💉 Prompt Injection · arxiv.org · 2d
Thoughts on AI Safety Megagame Design
🆕 New AI · lesswrong.com · 5d
Unveiling the Backdoor Mechanism Hidden Behind Catastrophic Overfitting in Fast Adversarial Training
🛡️ AI Safety · arxiv.org · 2d
Spontaneous introspection in conversation tampering
💉 Prompt Injection · lesswrong.com · 3d
Poster: ClawdGo: Endogenous Security Awareness Training for Autonomous AI Agents
🛡️ AI Safety · arxiv.org · 2d
RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents
💉 Prompt Injection · arxiv.org · 2d
Semantic Denial of Service in LLM-controlled robots
💉 Prompt Injection · arxiv.org · 1d
Breaking MCP with Function Hijacking Attacks: Novel Threats for Function Calling and Agentic Models
🕳 LLM Vulnerabilities · arxiv.org · 6d
SMSI: System Model Security Inference: Automated Threat Modeling for Cyber-Physical Systems
🛡️ AI Safety · arxiv.org · 2d